A Variable Metric Probabilistic k-Nearest-Neighbours Classifier

نویسندگان

  • Richard M. Everson
  • Jonathan E. Fieldsend
چکیده

12:06 30th March 2004 Abstract The k -nearest neighbour (k -nn) model is a simple, popular classifier. Probabilistic k -nn is a more powerful variant in which the model is cast in a Bayesian framework using (reversible jump) Markov chain Monte Carlo methods to average out the uncertainy over the model parameters. The k -nn classifier depends crucially on the metric used to determine distances between data points. However, scalings between features, and indeed whether some subset of features is redundant, are seldom known a priori. Here we introduce a variable metric extension to the probabilistic k -nn classifier, which permits averaging over all rotations and scalings of the data. In addition, the method permits automatic rejection of irrelevant features. Examples are provided on synthetic data, illustrating how the method can deform feature space and select salient features, and also on real-world data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An empirical analysis of the probabilistic K-nearest neighbour classifier

The probabilistic nearest neighbour (PNN) method for pattern recognition was introduced to overcome a number of perceived shortcomings of the nearest neighbour (NN) classifiers namely the lack of any probabilistic semantics when making predictions of class membership. In addition the NN method possesses no inherent principled framework for inferring the number of neighbours, K, nor indeed assoc...

متن کامل

An Accuracy-Assured Privacy-Preserving Recommender System for Internet Commerce

Recommender systems, tool for predicting users’ potential preferences by computing history data and users’ interests, show an increasing importance in various Internet applications such as online shopping. As a well-known recommendation method, neighbourhood-based collaborative filtering has attracted considerable attentions recently. The risk of revealing users’ private information during the ...

متن کامل

An efficient weighted nearest neighbour classifier using vertical data representation

The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algorithm, called PINE, using vertical data representation. A metric called HOBBit is used as the distance metric. The PINE algorithm applies a Gaussian podium function to set weights to different neighbours. We compare PIN...

متن کامل

Pseudo-Likelihood Inference Underestimates Model Uncertainty: Evidence from Bayesian Nearest Neighbours

When using the K-nearest neighbours (KNN) method, one often ignores the uncertainty in the choice of K. To account for such uncertainty, Bayesian KNN (BKNN) has been proposed and studied (Holmes and Adams 2002 Cucala et al. 2009). We present some evidence to show that the pseudo-likelihood approach for BKNN, even after being corrected by Cucala et al. (2009), still significantly underest...

متن کامل

Extended k-Nearest Neighbours based on Evidence Theory

An evidence theoretic classification method is proposed in this paper. In order to classify a pattern we consider its neighbours, which are taken as parts of a single source of evidence to support the class membership of the pattern. A single mass function or basic belief assignment is then derived, and the belief function and the pignistic (“betting rates”) probability function can be calculat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004